Language model adaptation using minimum discrimination information
Abstract
In this paper, adaptation of language models using the minimum discrimination information criterion is presented. Language model probabilities are adapted based on unigram, bigram and trigram features, using a modified version of the generalized iterative scaling algorithm. Furthermore, a language model compression algorithm based on conditional relative entropy is discussed; it removes from the language model those probability terms that can be closely approximated by back-off distributions. The proposed algorithms are used to adapt a mismatched, newspaper-style language model to a natural language call routing task. The experiments show a significant reduction in perplexity and word error rate for small amounts of adaptation data.
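As a rough illustration of the unigram case, the adaptation step can be pictured as iterative scaling of the background probabilities until the model's unigram marginal matches the one estimated from the adaptation data. The Python sketch below uses assumed data structures (plain dictionaries; all names hypothetical) and is not the paper's implementation; in particular it omits the bigram and trigram features that the modified generalized iterative scaling algorithm also handles:

```python
from collections import defaultdict

def mdi_unigram_adapt(p_bg, target_unigram, history_prior, n_iter=50):
    """Iteratively rescale P_bg(w|h) until the adapted model's unigram
    marginal matches target_unigram (GIS-style sketch, unigram features only).

    p_bg:           dict history -> {word: prob}  (background model)
    target_unigram: dict word -> desired marginal P(w) from adaptation data
    history_prior:  dict history -> P(h)
    """
    alpha = {w: 1.0 for w in target_unigram}  # one scaling factor per feature
    adapted = {}
    for _ in range(n_iter):
        marginal = defaultdict(float)
        for h, dist in p_bg.items():
            scaled = {w: p * alpha.get(w, 1.0) for w, p in dist.items()}
            z = sum(scaled.values())          # per-history normalization Z(h)
            adapted[h] = {w: p / z for w, p in scaled.items()}
            for w, p in adapted[h].items():
                marginal[w] += history_prior[h] * p
        for w, t in target_unigram.items():   # multiplicative GIS update
            if marginal[w] > 0:
                alpha[w] *= t / marginal[w]
    return adapted
```

The compression step can be sketched in the same spirit: an explicit probability term is dropped when replacing it by its back-off approximation changes the conditional relative entropy by less than a threshold. Again a simplified, hypothetical check; the paper's criterion also has to account for renormalizing the back-off weights after removal, which this sketch ignores:

```python
import math

def can_prune(p_wh, p_backoff_wh, p_h, threshold=1e-7):
    """Drop an explicit P(w|h) if its back-off approximation is close
    enough, weighted by the history prior P(h). Simplified sketch; the
    threshold value here is arbitrary."""
    return p_h * p_wh * abs(math.log(p_wh / p_backoff_wh)) < threshold
```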
Similar papers
Efficient language model adaptation through MDI estimation
This paper presents a method for n-gram language model adaptation based on the principle of minimum discrimination information. A background language model is adapted to fit constraints on its marginal distributions that are derived from new observed data. This work gives a different derivation of the model by Kneser et al. (1997) and extends its application to interpolated language models. The pr...
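For unigram constraints, the MDI solution takes the well-known closed form (as in Kneser et al. (1997), cited above): the background probabilities are rescaled per word and renormalized per history, with an exponent $\beta \le 1$ damping the adaptation ratio:

$$P_A(w \mid h) = \frac{\alpha(w)\,P_B(w \mid h)}{Z(h)}, \qquad \alpha(w) = \left(\frac{\hat{P}_A(w)}{P_B(w)}\right)^{\beta}, \qquad Z(h) = \sum_{w'} \alpha(w')\,P_B(w' \mid h)$$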
Constraint selection for topic-based MDI adaptation of language models
This paper presents an unsupervised topic-based language model adaptation method which specializes the standard minimum discrimination information approach by identifying and combining topic-specific features. By acquiring a topic terminology from a thematically coherent corpus, language model adaptation is restricted to re-estimating only the probabilities of n-grams ending with some topic-speci...
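A minimal sketch of the constraint-selection idea, under an assumed representation of n-grams as word tuples (names hypothetical):

```python
def select_topic_constraints(ngram_counts, topic_terms):
    """Keep only the n-grams whose final word belongs to the acquired
    topic terminology; only these feed the MDI re-estimation (sketch)."""
    return {ng: c for ng, c in ngram_counts.items() if ng[-1] in topic_terms}
```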
MDI adaptation for the lazy: avoiding normalization in LM adaptation for lecture translation
This paper provides a fast alternative to Minimum Discrimination Information-based language model adaptation for statistical machine translation. It avoids computing a normalization term that requires full model probabilities (including back-off probabilities) for all n-grams. Rather than re-estimating an entire language model, our Lazy MDI approach leverages a smoo...
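One way to picture the "lazy" idea, as a hypothetical sketch (names and smoothing are assumptions, not the paper's code): score each word by a damped ratio of adaptation-corpus to background unigram probabilities and hand it to the SMT log-linear model as an extra feature, so no per-history normalization Z(h) is ever computed:

```python
import math

def lazy_ratio_feature(word, p_adapt_uni, p_bg_uni, gamma=0.5, floor=1e-10):
    """Log of the damped unigram ratio, usable as an additional log-linear
    feature; the decoder's feature weights absorb the missing Z(h)."""
    num = max(p_adapt_uni.get(word, floor), floor)
    den = max(p_bg_uni.get(word, floor), floor)
    return gamma * math.log(num / den)
```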
Adaptive Language Modeling Using Minimum Discriminant Estimation
We present an algorithm to adapt an n-gram language model to a document as it is dictated. The observed partial document is used to estimate a unigram distribution for the words that have already occurred. Then we find the n-gram distribution closest to the static n-gram distribution (using the discrimination information distance measure) that satisfies the marginal constraints derived from the ...
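The unigram estimate from the dictated-so-far text is essentially a cache distribution. A minimal sketch, with an assumed interpolation weight `lam` that smooths it with the static unigram:

```python
from collections import Counter

def cache_unigram(observed_words, vocab, static_uni, lam=0.8):
    """Mix the empirical unigram of the partial document with the static
    unigram distribution (hypothetical smoothing; sketch only)."""
    c, n = Counter(observed_words), max(len(observed_words), 1)
    return {w: lam * (c[w] / n) + (1.0 - lam) * static_uni[w] for w in vocab}
```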
Topic Adaptation for Lecture Translation through Bilingual Latent Semantic Models
This work presents a simplified approach to bilingual topic modeling for language model adaptation by combining text in the source and target language into very short documents and performing Probabilistic Latent Semantic Analysis (PLSA) during model training. During inference, documents containing only the source language can be used to infer a full topic-word distribution on all words in the ...
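For reference, a compact EM loop for PLSA on a document-word count matrix; this is a generic sketch of the model, not the paper's bilingual setup (there, each "document" concatenates source- and target-language text):

```python
import numpy as np

def plsa(counts, n_topics, n_iter=100, seed=0):
    """Minimal PLSA via EM. counts: (docs, words) array of term counts.
    Returns P(w|z) with shape (n_topics, words) and P(z|d) with shape
    (docs, n_topics)."""
    rng = np.random.default_rng(seed)
    n_docs, n_words = counts.shape
    p_w_z = rng.random((n_topics, n_words))
    p_w_z /= p_w_z.sum(axis=1, keepdims=True)
    p_z_d = rng.random((n_docs, n_topics))
    p_z_d /= p_z_d.sum(axis=1, keepdims=True)
    for _ in range(n_iter):
        # E-step: responsibilities P(z|d,w) proportional to P(z|d) P(w|z)
        joint = p_z_d[:, :, None] * p_w_z[None, :, :]        # (d, z, w)
        joint /= joint.sum(axis=1, keepdims=True) + 1e-12
        expected = counts[:, None, :] * joint                # expected counts
        # M-step: re-estimate the two conditional distributions
        p_w_z = expected.sum(axis=0)
        p_w_z /= p_w_z.sum(axis=1, keepdims=True) + 1e-12
        p_z_d = expected.sum(axis=2)
        p_z_d /= p_z_d.sum(axis=1, keepdims=True) + 1e-12
    return p_w_z, p_z_d
```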